Refactor ConvertGenericCallExpr#145
Conversation
| using Kind = CallArg::Kind; | ||
|
|
||
| unsigned arg_begin = 0; // skip count for operator()'s implicit object arg | ||
| CallInfo info{}; |
| } | ||
| } | ||
|
|
||
| void Converter::EmitCall(CallInfo info) { |
There was a problem hiding this comment.
I changed to CalInfo&& info
| Materialized, | ||
| }; | ||
|
|
||
| clang::Expr *expr; |
There was a problem hiding this comment.
usually these are sorted by size (larger to shorter) to reduce struct size.
| std::optional<TempMaterializationCtx> ConvertCallExpr(clang::CallExpr *expr); | ||
|
|
||
| struct CallArg { | ||
| enum class Kind { |
| case Kind::Hoisted: | ||
| StrCat("let", | ||
| std::format("_{}: {}", ca.param_name, ToString(ca.param_type)), | ||
| "="); |
There was a problem hiding this comment.
Replaced with std::format("let {}: {} =")
| bool is_variadic; | ||
| bool is_fn_ptr_call; | ||
| std::vector<CallArg> args; | ||
| std::vector<clang::Expr *> variadic_args; |
There was a problem hiding this comment.
there's only up to one variadic arg. this can be a simple pointer initialized to null
There was a problem hiding this comment.
No, for example in the following code:
void foo(int a, ...);
foo(1, 2, 3);=>
args = 1,
variadic_args = 2, 3
And the Rust code is:
fn foo(a: i32, args: &[VaArg])
foo(1, &[2.into(), 3.into()])| std::string param_name; | ||
| std::string ref_temp_name; | ||
| clang::QualType param_type; | ||
| clang::Expr *expr; |
There was a problem hiding this comment.
switch param_type and expr
There was a problem hiding this comment.
The size of both param_type and expr is 8, I inspected using -fdump-record-layouts. Is there other reason for switching?
Dump of -Xclang -fdump-record-layouts
*** Dumping AST Record Layout
0 | struct cpp2rust::Converter::CallArg
0 | class std::basic_string<char> param_name
0 | struct std::basic_string<char>::_Alloc_hider _M_dataplus
0 | class std::allocator<char> (base) (empty)
0 | class std::__new_allocator<char> (base) (empty)
0 | pointer _M_p
8 | size_type _M_string_length
16 | union std::basic_string<char>::(anonymous at /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:204:7)
16 | char[16] _M_local_buf
16 | size_type _M_allocated_capacity
32 | class std::basic_string<char> ref_temp_name
32 | struct std::basic_string<char>::_Alloc_hider _M_dataplus
32 | class std::allocator<char> (base) (empty)
32 | class std::__new_allocator<char> (base) (empty)
32 | pointer _M_p
40 | size_type _M_string_length
48 | union std::basic_string<char>::(anonymous at /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:204:7)
48 | char[16] _M_local_buf
48 | size_type _M_allocated_capacity
64 | class clang::QualType param_type
64 | class llvm::PointerIntPair<class llvm::PointerUnion<const class clang::Type *, const class clang::ExtQuals *>, 3> Value
64 | struct llvm::detail::PunnedPointer<class llvm::PointerUnion<const class clang::Type *, const class clang::ExtQuals *> > Value
64 | unsigned char[8] Data
72 | clang::Expr * expr
80 | _Bool has_default
81 | Kind kind
| [sizeof=88, dsize=82, align=8,
| nvsize=82, nvalign=8]
|
|
||
| struct CallInfo { | ||
| clang::Expr *callee; | ||
| bool is_variadic; |
There was a problem hiding this comment.
is_variadic isn't equal to !variadic_args.empty()?
There was a problem hiding this comment.
No, a variadic function call be called with 0 variadic arguments, in which case we need to emit an empty &[VaArgs] like this: foo(0, &[])
| } | ||
| } | ||
|
|
||
| void Converter::EmitCall(CallInfo &&info) { |
There was a problem hiding this comment.
const CallInfo &
The function doesn't consume or take ownership
There was a problem hiding this comment.
It can't be const CallInfo & because EmitHoistedArgs which is called inside EmitCall takes a CallInfo&. But EmitCall can be CallInfo&
|
@copilot resolve the merge conflicts in this pull request |
Depends on Cpp2Rust#161. See Cpp2Rust#161 for a more detailed description of how goto is implemented. On the codegen part, this PR emits the goto_block and goto macros when it hits functions that use labels and gotos internally. It also hoists local variables at the start of the function so that all goto_block arms can see and use the local variables.
Changed the access pattern of flexible arrays from ```rs let bytes: [u8; 1] = ...; bytes[10] = ... ``` to ```rs let bytes: [u8; 1] = ...; *bytes.as_mut_ptr().add(10) = ...; ``` The original access panics because the bounds are violated, i.e. accessing the 10th element in an array declared with 1 element. The new access does not panic, it preserves the C/C++ behavior.
C allows name collision between the following 2 declarations:
```c
struct X {};
typedef enum {} X;
```
The 2 declarations live in different name spaces:
6.2.3 Name spaces of identifiers
[...] there are separate namespaces for various categories of
identifiers, as follows:
1. the tags of structures, unions, and enumerations (disambiguated by
following any) of the keywords struct, union, or enum);
2. all other identifiers, called ordinary identifiers (declared in
ordinary declarators or as enumeration constants).
`struct X {}` lives in namespace 1 and `typedef enum {} X` lives in
namespace 2.
We cannot translate both to `X` in the Rust code. We need to
disambiguate between the 2 names. I chose to translate them as:
```
struct X {} -> X
typedef enum {} X -> X_enum
```
C++ does not have the name space rule for identifiers. I added the
`tag->getASTContext().getLangOpts().CPlusPlus` check in
DisambiguateAnonymousTag to avoid polluting the C++ generated files.
---------
Co-authored-by: Nuno Lopes <nuno.lopes@tecnico.ulisboa.pt>
`"\xff"` or `'\xff'` are valid strings/char literals in C/C++. They were not escaped and produced an invalid character that broke the compilation. After I escaped them, I realized that they cannot be translated to `\xff` in Rust `str`'s because str only accepts `\x00 - \x7f`. Byte array accepts `> \x7f`. I changed Ptr::from_string_literal to accept a byte array instead of a str. I also changed GetFmtArg to refuse to create Rust `str` that contains chars `> \x7f` and fallback to GetRawArg that produces escaped byte arrays.
The generated `Ord::cmp` for method-based `operator<` used `other.<op>(other)` in its second branch, which can never evaluate true and therefore suppresses `Ordering::Greater`. This change aligns that branch with the intended reverse comparison (`other.<op>(self)`), matching existing model-specific behavior. Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
…st#175) `const char var[] = {"string literal"}` is a valid initialization. Add support for it in codegen.
`return std::format("anon_enum_{}", GetLineNumber(enum_decl));` results
in name clashes between enums declared in different files at the same
line, I replaced that with `return GetNamedDeclAsString(enum_decl);`
`libc::strlen(dummy as *const i8) as u64.into()` (without paren) errors because we can't write `as u64.into()`, we need to wrap the entire arg in a paren before calling .into(): `(libc::strlen(dummy as *const i8) as u64).into()`
A continue statement inside a `do while` breakes the iteration and
re-evaluates the condition.
Our previous translation, i.e.
```rs
loop {
...
if !cond {
break
}
}
```
never re-evaluated the condition after a continue, resulting in an
infinite loop.
To fix this, use while instead of loop. This ensures that the condition
is re-evaluated right after the continue:
```rs
let mut __do_while = true;
while __do_while || cond {
__do_while = false;
...
}
```
__do_while makes sure that the first iteration of the loop is always
executed.
…ust#178) `StrCat` calls were using single-character string literals in many places (e.g. `">"`-style delimiters and punctuation), which is inconsistent with the intended char-based usage for single codepoints. This updates those call sites to pass character literals instead. Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Done |
ConvertGenericCallExpr is a messy function and it became difficult to add new functoinality on top of it. So I decided to refactor it into:
EmitCall(CollectCallInfo(expr)).In the near future I plan to add 3 new features on top of it:
In the new version of ConvertGenericCallExpr, CollectCallInfo is responsible for returning the call information:
And EmitCall is responsible for emitting the hoisted arguments, the callee and the argument list using a CallInfo.
This model makes the addition of new strategies, for example
Don't hoist arguments that are integer literals, they don't alias with any other argument, easier because only the CallInfo fields have to be modified.